Parallel H.264 Decoding on an Embedded Multicore Processor
Abstract
In previous work, the 3D-Wave parallelization strategy was proposed to increase the parallel scalability of H.264 video decoding. This strategy is based on the observation that inter-frame dependencies have a limited spatial range. The previous results, however, investigated application scalability only on an idealized multiprocessor. This work presents an implementation of the 3D-Wave strategy on a multicore architecture composed of NXP TriMedia TM3270 embedded processors. The results show that the parallel H.264 implementation scales very well, achieving a speedup of more than 54 on a 64-core processor. The 3D-Wave strategy has two potential drawbacks: memory requirements increase because many frames can be in flight at once, and the latency of some frames might increase. To address these drawbacks, policies that reduce the number of frames in flight and the frame latency are also presented. The results show that these policies mitigate the memory and latency issues with a negligible effect on performance scalability.
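The idea that a limited spatial range of inter-frame dependencies allows several frames to be decoded at once can be illustrated with a small simulation. The sketch below is not the paper's implementation. It assumes unit-time macroblocks, unlimited cores, a single reference frame (the previous one), the usual four intra-frame neighbours (left, up-left, up, up-right), and a hypothetical `mv_range` parameter giving the motion vector reach in macroblocks. The `max_in_flight` argument models one simple, hypothetical frames-in-flight limit, which is not necessarily the policy the authors use.

```python
def mb_ready(f, r, c, done, rows, cols, mv_range):
    """Return True if macroblock (r, c) of frame f has all dependencies decoded."""
    # Intra-frame dependencies (assumed): left, up-left, up, up-right.
    for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and (f, nr, nc) not in done:
            return False
    if f == 0:
        return True  # first frame has no reference frame in this model
    # Inter-frame dependency (assumed): a window of +/- mv_range macroblocks
    # around the co-located macroblock in the previous frame.
    for nr in range(max(0, r - mv_range), min(rows, r + mv_range + 1)):
        for nc in range(max(0, c - mv_range), min(cols, c + mv_range + 1)):
            if (f - 1, nr, nc) not in done:
                return False
    return True


def simulate(frames, rows, cols, mv_range, max_in_flight=None):
    """Count time steps needed when every ready macroblock is decoded in parallel."""
    done = set()
    decoded = [0] * frames          # macroblocks decoded so far, per frame
    per_frame = rows * cols
    steps = 0
    while len(done) < frames * per_frame:
        # Oldest frame that is not yet fully decoded.
        oldest = next(f for f in range(frames) if decoded[f] < per_frame)
        # Hypothetical policy: only frames oldest .. oldest+max_in_flight-1 may run.
        last = frames if max_in_flight is None else min(frames, oldest + max_in_flight)
        batch = [
            (f, r, c)
            for f in range(oldest, last)
            for r in range(rows)
            for c in range(cols)
            if (f, r, c) not in done and mb_ready(f, r, c, done, rows, cols, mv_range)
        ]
        done.update(batch)
        for f, _, _ in batch:
            decoded[f] += 1
        steps += 1
    return steps


if __name__ == "__main__":
    # Three tiny frames of 3 x 4 macroblocks, motion vectors reaching 1 macroblock.
    print("overlapped frames:", simulate(3, 3, 4, mv_range=1))
    print("one frame in flight:", simulate(3, 3, 4, mv_range=1, max_in_flight=1))
```

With `max_in_flight=1` the model degenerates to decoding one frame at a time, so comparing the two step counts gives a rough feel for the trade-off between overlap and the number of frames held in memory.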
Related resources
A Hardware Task Scheduler for Embedded Video Processing
Modern embedded Systems-on-a-Chip deploy multiple programmable cores to meet increasing performance requirements of video, graphics, and modem applications. However, software implementations of task scheduling and inter-task synchronization often limit performance improvements of multicores. Remarkably, several demanding video applications (e.g. H.264 video decoding) rely on task dependency gra...
Parallel Architecture Core (PAC) - the First Multicore Application Processor SoC in Taiwan Part II: Application Programming
Two representative multimedia applications, AAC and H.264/AVC decoders on the parallel architecture core (PAC) SoC, are introduced in the second part of the two introductory papers. The applications have been programmed on the PACDSP core and the PAC SoC to demonstrate the high-performance, low-power DSP computations and the effectiveness of the dynamic voltage and frequency scaling (DVFS) capab...
A Cache-Aware Strategy for H.264 Decoding on Multi-processor Architectures
H.264 is one of the most commonly used formats for the recording, compression and distribution of video. Encoders and decoders for the H.264 standard are widely in demand, and efficient strategies for enhancing their performance have been areas of active research. With the proliferation of many core architectures in the embedded community, there has been a trend towards parallelizing implementa...
Implementation of a Coarse-Grained Reconfigurable Media Processor for AVC Decoder
ADRES (Architecture for Dynamically Reconfigurable Embedded Systems) is a templatized coarse-grained reconfigurable processor architecture. It targets embedded applications that demand high performance, low power, and high-level language programmability. Compared with a typical VLIW-based DSP, ADRES can exploit higher parallelism by using more scalable hardware with support of novel compilatio...
Application Specific Processor Design for H.264 Decoder with a Configurable Embedded Processor
Jin Ho Han et al. An application-specific processor for an H.264 decoder with a configurable embedded processor is designed in this research. The motion compensation, inverse integer transform, inverse quantization, and entropy decoding algorithms of the H.264 decoder software are optimized. We improved the performance of the processor with instruction-level hardware optimization, which is tailo...